BUSINESS LEVEL DATA DISCLOSED UNDER FASB NO. 14:
EFFECTIVE USE IN STRATEGIC MANAGEMENT RESEARCH

In response to the observed sub-optimal use of the Compustat II
line-of-business database, we examine that database in the context of
three issues critical to strategic management research:
diversification, industry analysis, and vertical integration. Our
analysis should help researchers protect the integrity of studies based
on this increasingly popular database.

Research on multibusiness firms has long been hampered by the
absence of firm-specific data aggregated at the line-of-business level.
Thus, the appearance of the COMPUSTAT II Line of Business database,
which contains firms' disclosures of information required by FASB-SFAS
No. 14, was heralded by strategic management and other researchers.
This paper reports research on that data set, which is of increasing
importance to strategic management researchers. We consider how the
line-of-business (or segment) data in COMPUSTAT II can be used most
effectively in the context of three issues critical to much strategic
management research: 1) calculation of the related and unrelated
components of diversification to assess the extent and type of
diversification, 2) assessment of industry trends, and 3) evaluation of
the presence of vertical integration.

Several  trends  have  converged  to  necessitate  attention  to  proper 
use  of  this  data  set.   First,  the  availability  of  data  disaggregated  to 
the  line  of  business  level  has  enabled  researchers  to  address 
interesting  and  important  research  questions  for  which  appropriate  data 
was  previously  unavailable  in  the  public  domain.   Second,  interest  in 
diversification,  acquisition,  divestment  and  related  topics  has  been 
high,  consistent  with  greater  incidence  of  those  phenomena  in  the 
corporate  world.   As  a  result,  researchers  have  approached  this  data 
set  with  enthusiasm.   Attractive  and  useful  as  the  data  is,  careful 
attention  to  its  characteristics,  understanding  of  its  composition,  and 
thus  compensation  for  its  limitations  is  necessary  to  protect  the 
integrity  of  research  using  this  data  set  and  the  value  of  such 
research  results. 


This  paper  briefly  introduces  and  describes  the  nature  of  the 
database,  discusses  and  illustrates  certain  common  pitfalls  researchers 
should  avoid,  and  explains  appropriate  methods  for  correct  use  of  this 
valuable  data. 

COMPOSITION  OF  THE  COMPUSTAT  II  DATA  SET 

The COMPUSTAT line of business database is compiled from firms'
annual reports and 10-K reports to the Securities and Exchange
Commission (SEC). Disclosure of the financial information in this
database is required by FASB-SFAS No. 14, "Financial Reporting for
Segments of a Business Enterprise." This accounting standard defines a
segment as "a component of an enterprise engaged in providing a
product or service, or a group of related products or services
primarily to unaffiliated customers (i.e., customers outside the
enterprise) for a profit." Since the institution of these segment
reporting requirements, Standard & Poor's COMPUSTAT Service has been
compiling the segment information of more than 6,000 publicly traded
companies, including all companies traded on the NYSE, ASE and OTC, in
the line of business database (COMPUSTAT II Line of Business Data).

FASB-SFAS No. 14 requires only that each company identify each of
its segments by name. For the purposes of more detailed and comparable
descriptions, COMPUSTAT (S&P personnel) assigns a maximum of two
4-digit SICs to each segment (SSIC1 and SSIC2). This further
disaggregation of the data (identification of lines of business within
the FASB-required segments) has been viewed favorably (and correctly
so) by researchers interested in business-level strategic issues.
However, herein lies the potential for misuse and abuse of this
database, which may lead to erroneous research results.

First, it is important to note that COMPUSTAT (Standard & Poor's,
not the companies) identifies the businesses by SIC codes (SSICs). It
is reasonable to assume that COMPUSTAT's SSIC code assignments are
carefully and consistently executed. The COMPUSTAT II Line of Business
Manual explains that "SSICs are based on the activities of the segments
as described by the company in its annual report or 10-K. The first
SIC should be considered the primary SIC of the segment.... SPCS will
attempt to assign two SIC codes, for each industry segment."
(COMPUSTAT II Section 5-A, p. 26). It is important to recognize,
however, that distance between the SSIC codes of businesses within a
segment should not be viewed as unrelatedness of those businesses; by
joining them in a segment, the company has indicated that such
businesses belong to "a group of related products or services" (FASB
segment definition, above).

Second, it is important to understand the second SSIC code for
each segment and the relationship between the two lines of business per
segment as identified by COMPUSTAT. Some previous researchers have
treated the lines of business within segments as separable (allowing
the presence of different SSIC codes to override the fact that the two
lines of business are housed by the firm in the same segment). For
some research questions, such separation of a segment's lines of
business may be correct, but for many questions of importance to
strategic management (type of diversification, extent of vertical
integration), such separation would be a serious error. As noted above,
the companies provide descriptions of their lines of business (on which
COMPUSTAT bases its SSIC assignments) in annual reports and 10-Ks, and
they assert relatedness among some lines of business by grouping
certain businesses together as a segment (see FASB segment definition
above).

Thus, we argue that the second SIC code (SSIC2) may be assumed to
denote an activity related to the manufacture or service of the
activity in the primary SIC (SSIC1). Support for this assumption is
based on the fact that a segment comprises "a component of an
enterprise engaged in providing ... a group of related products or
services to unaffiliated customers ..." (COMPUSTAT II Section 2, p. 2).
Within-company sales do not comply with the requirement of selling to
"unaffiliated customers." Therefore, by definition, vertical or
horizontal integration activity has to be assigned to the segment of
the end product.

The Compustat documentation refers to yet another level of
disaggregation, the PSIC (product SIC), but researchers should note
that PSICs are also assigned by S&P and that disclosure by firms at
that further level of disaggregation is not required by law. Thus, the
PSIC data are quite spotty and of questionable consistency and
accuracy.

USING  COMPUSTAT  DATA  IN  VERTICAL  INTEGRATION  RESEARCH 

Because the assumption of relatedness within firms' segments is
critical to the contribution of this paper, we conducted an analysis of
the line of business database to determine whether further support for
the assumption existed. The procedures and results of that analysis
are described next.

In our analysis, we compared SSIC1 and SSIC2 for all segments in
the database for the years 1979-85 (availability of COMPUSTAT II data
begins with 1979). Tables 1 and 2 illustrate the crosstabulation
between SSIC1 and SSIC2 at the 2-digit and 3-digit levels,
respectively.


Insert Tables 1 and 2 about here


As shown, 30.5% of the segments had only a primary SIC (SSIC2 was 0;
COMPUSTAT assigns a maximum of two SIC codes per segment). As that
30.5% are single-business segments, they are not central to this
paper. It is reasonable to assume some sort of relatedness for two
other groups which emerged from the data: segments with both the
primary and secondary businesses (SSIC1 and SSIC2) in the same 3-digit
SIC (10% of all segments) and the 28% of all segments with both SSIC1
and SSIC2 in the same 2-digit SIC. (It is interesting to note that
when the SIC match is relaxed from the 3-digit level to the 2-digit
level, the proportion of segments with both businesses in the same SIC
increases from 10% to 28%.)
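
The comparison of SSIC1 and SSIC2 described above can be sketched in
code. The following is a minimal illustration, not the procedure
actually used to build Tables 1 and 2. The function name
sic_match_level and the representation of SSICs as integers (with 0
meaning that no secondary SIC was assigned) are assumptions made for
the example. The function reports the tightest level at which the two
codes agree, so a segment that matches at the 3-digit level is not
counted again at the 2-digit level.

```python
def sic_match_level(ssic1: int, ssic2: int) -> str:
    """Classify a segment by how closely its two segment SICs agree.

    Hypothetical helper: SSICs are 4-digit SIC codes held as integers,
    and ssic2 == 0 means the segment has only a primary SIC.
    """
    if ssic2 == 0:
        return "single-business"
    # Zero-pad so that codes such as 0100 keep their leading digit.
    s1, s2 = f"{ssic1:04d}", f"{ssic2:04d}"
    if s1[:3] == s2[:3]:
        return "same 3-digit"
    if s1[:2] == s2[:2]:
        return "same 2-digit"
    return "different 2-digit"


# SIC pairs taken from examples elsewhere in this paper.
for pair in [(2834, 0), (3822, 3823), (2834, 2842), (2020, 5143)]:
    print(pair, sic_match_level(*pair))
```

Tabulating these labels over all segment-years would give proportions
comparable in kind to those reported above, provided the cumulative
2-digit figure is formed by adding the two "same" categories.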

We then further examined the 41.8% of all segments whose primary
and secondary businesses did not fall in at least the same 2-digit SIC.
Erroneous classification of a segment's businesses as unrelated seemed
most likely to occur within this (fairly large) group. We examined the
data for vertical integration relationships within firms' segments, as
FASB reporting requirements specify that segments must be formed such
that they provide "products or services to unaffiliated customers"
(thus any vertical or horizontal integration the firm engages in must
be housed within segments).

Vertical integration in a segment may be of two types. One type
occurs in instances such as those where metallic ore extraction and
metal manufacturing are in the same segment, or where petroleum
extraction and petroleum wholesale distribution are in the same
segment. The other type of vertical integration occurs when a segment
includes activities where manufacturing output from one 2-digit SIC
becomes input for manufacturing in a different 2-digit SIC. This would
be the case for a segment identified by SSICs 2200 (textile mill
products) and 2330 (women's apparel).

The first type of vertical integration is relatively simple to
discern. A segment is identified as vertically integrated when one of
its SSICs is in raw materials (SIC 0100-1999), manufacturing (SIC
2000-3999), or service (SIC 4000-9999), while the other SSIC belongs to
one of the other two areas; it may be assumed that the two businesses'
presence in the same segment is an indication of vertical integration.
For instance, a segment having SSIC1 = 2020 and SSIC2 = 5143 is forward
integrated, because SIC 2020, the primary SIC, is the manufacture of
dairy products and SIC 5143, the secondary SIC, is the wholesale of
dairy products. Our analysis, summarized in Table 3, indicates that
for at least 36% of the segments in question (15% of the total number
of segments), the primary and secondary businesses were related by this
first type of vertical integration.


Insert Table 3 about here

The second method of establishing vertical integration is more
complex. For the remaining 26.5% of segments (those for which SSIC1
and SSIC2 were not in the same 2-digit SIC and were not in different
industry stages), both the primary and secondary businesses fell
within the same stage: raw materials, manufacturing or service. A
random check of these segments showed that a large proportion of even
these segments (85% of those checked) had some related integrated
activity, upstream or downstream. For example, a segment with SSICs
3721 and 3664 might appear to consist of unrelated manufacturing
activity, if one looks only at similarity of SSICs. Yet closer
inspection reveals the firm's logic in assigning these activities to a
single segment: SIC 3721 is aircraft manufacturing and SIC 3664 is the
manufacture of search, detection, navigation and guidance systems and
equipment. Activity in SIC 3664 provides critical instrumentation used
in all types of aircraft, especially defense aircraft; thus this
segment contains vertically integrated, not unrelated, businesses.
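
The first, stage-based test described above can be expressed
compactly. The sketch below is illustrative only. The function names
are hypothetical, the SIC ranges are those given above, and labelling a
segment whose secondary SIC lies upstream of its primary SIC as
"backward" integrated is an assumption made by symmetry with the
forward-integration example (SSIC1 = 2020, SSIC2 = 5143). Segments
whose two SSICs fall in the same stage return None; they still require
the case-by-case inspection described for the second type.

```python
from typing import Optional

# Stage order runs from upstream (0) to downstream (2).
STAGE_ORDER = {"raw material": 0, "manufacturing": 1, "service": 2}


def industry_stage(sic: int) -> str:
    """Map a 4-digit SIC to the broad stage ranges used in the text."""
    if 100 <= sic <= 1999:
        return "raw material"
    if 2000 <= sic <= 3999:
        return "manufacturing"
    if 4000 <= sic <= 9999:
        return "service"
    raise ValueError(f"SIC {sic} is outside the ranges 0100-9999")


def cross_stage_integration(ssic1: int, ssic2: int) -> Optional[str]:
    """Return 'forward', 'backward', or None for one segment.

    ssic2 == 0 (no secondary SIC) and same-stage pairs both give None.
    """
    if ssic2 == 0:
        return None
    primary = STAGE_ORDER[industry_stage(ssic1)]
    secondary = STAGE_ORDER[industry_stage(ssic2)]
    if primary == secondary:
        return None
    # Assumption: direction is judged relative to the primary SIC.
    return "forward" if secondary > primary else "backward"


print(cross_stage_integration(2020, 5143))  # dairy manufacture + wholesale
print(cross_stage_integration(3721, 3664))  # both manufacturing
```

The aircraft example (SSICs 3721 and 3664) shows the limit of this
test: both codes are in manufacturing, so the function returns None
even though the businesses are vertically related.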

Research on vertical integration that raises questions about the
type or extent of vertical integration in certain industries can thus
make use of COMPUSTAT line-of-business data very effectively. The
above analysis has reconfirmed that activity reported by firms as being
associated with a single product or group of related products is indeed
so, despite the fact that the varied SSICs within segments can give the
appearance of unrelated businesses. To recapitulate the findings from
Tables 1 and 2, which cover all 6,007 firms on the COMPUSTAT II
line-of-business database: 58% of segments either had only a primary
SIC or had SSIC1 and SSIC2 in the same 2-digit code, and nearly half of
the remaining 42% were vertically integrated segments with SSIC1 as the
primary activity of the segment. Therefore, a segment is best described
by the primary SIC of the segment (SSIC1) at the 2-digit level.
Splitting a firm's segments into the two SSICs assigned to each, and
treating them as separable businesses, as has been done in some
studies, inappropriately increases the measure of strategic diversity
of that firm's activity.

USE  OF  COMPUSTAT  DATA  IN  INDUSTRY  ANALYSIS 

COMPUSTAT II data lends itself to the calculation of industry
trends with respect to strategic diversity, vertical integration, and
many other issues. This is especially useful when research questions
require that industry trends be studied in conjunction with firms'
data, or with firms disaggregated at the segment level, as
compatibility of databases is a critical consideration in such
instances.

In  past  studies  requiring  industry  level  information,  data  from 
the  Census  of  Manufacturers  were  most  commonly  used.   However,  this 
data  has  limitations  which  render  it  inappropriate  for  use  in 
conjunction  with  COMPUSTAT  line  of  business  data.   First,  the  Census 
cautions  that  the  value  of  shipments  is  not  accurate  at  the  3-digit  and 
2-digit levels (Comments on Statistical Measures and Tables, nos. 18,
19, Census of Manufacturers, 1982):

     "Multiunit companies were instructed to report for each
     establishment as if it were a separate economic unit and, in
     particular to report interplant transfers at their full
     economic value." (page xxi)

     "The aggregates of the cost of materials and value of shipments
     figures for industry groups and all manufacturing industries
     includes large amounts of duplication since the products of
     some industries are used as materials by others. ... Because
     the amount of duplication of the cost of materials in the value
     of products figures cannot be measured with any degree of
     precision, caution is urged with the use of the value of
     shipments total at the two- and three-digit industry group
     levels." (page xxiii)

By contrast, Compustat data, which are built from sales to
unaffiliated customers, are less subject to such duplication and are
therefore more effective at these levels.

A second limitation of Census of Manufacturers data is that the
census is conducted only every five years (1977, 1982, 1987). Data for
the intervening years are estimated by surveying a sample of
one-fourth of the population. Of the years covered by the COMPUSTAT
line-of-business database (1978-86), only 1982 is a census year; data
for all other years in that period are estimates.

Yet another limitation is that the Census of Manufacturers covers
only firms in the SIC range 2000-3999, and does not provide comparable
data for non-manufacturing activities in SICs 0100-1999 and 4000-9999.
Data is available on those SIC groups, but from a variety of sources
(Census of Mining, Census of Agriculture, etc.); thus comparability of
definitions and of time periods cannot be assumed.

Using the COMPUSTAT II line-of-business data aggregated to the
industry level can overcome many of the Census data problems. Among
the advantages of COMPUSTAT II are that data are reported annually for
all firms in the database, that data are readily available online, that
duplication (double-counting of sales) as found in the Census data is
avoided, and that trends at the business, firm and industry levels can
be studied with confidence that the variables' definitions are the same
at all levels.
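
Aggregating segment data to the industry level amounts to summing
segment sales by year and by the 2-digit primary SIC, then comparing
the totals across years. The sketch below shows one way this might be
done. The function names, the tuple layout (year, SSIC1, sales), and
the sales figures are all hypothetical and are not drawn from the
COMPUSTAT II files. Each segment is kept intact and assigned to its
primary SIC only.

```python
from collections import defaultdict


def industry_sales(segments):
    """Sum segment sales by (year, 2-digit primary SIC).

    `segments` is an iterable of (year, ssic1, sales) tuples. Each
    segment is assigned wholly to the 2-digit group of its primary SIC.
    """
    totals = defaultdict(float)
    for year, ssic1, sales in segments:
        totals[(year, ssic1 // 100)] += sales
    return dict(totals)


def growth_rate(totals, industry, start, end):
    """Proportional change in industry sales between two years."""
    return totals[(end, industry)] / totals[(start, industry)] - 1.0


# Made-up segment records for illustration only.
segments = [
    (1979, 2834, 100.0),
    (1979, 2842, 50.0),
    (1979, 3721, 200.0),
    (1980, 2834, 120.0),
    (1980, 2842, 60.0),
    (1980, 3721, 210.0),
]
totals = industry_sales(segments)
print(round(growth_rate(totals, 28, 1979, 1980), 3))
```

Because the same sales variable is summed at every level, a trend
computed this way uses the same definition as the firm and segment
data it is compared with.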

The major limitation on use of COMPUSTAT II line-of-business data
for industry analysis is that the database includes only the companies
traded on the NYSE, ASE and OTC (6,007 firms), while the Census of
Manufacturers covers more than 220,000 public and private firms.
However, if industry constructs are operationalized as trends rather
than as absolutes, the impact of COMPUSTAT's limited company coverage
(6,007 firms from the total population) is minimized if not eliminated.
In addition, it should be pointed out that the population of publicly
traded companies, of which COMPUSTAT II is composed, represents almost
all the large U.S. firms, and those in turn represent a significant
proportion of the output of U.S. business enterprise. (The Census of
Manufacturers has estimated that the 200 largest manufacturing firms
account for 43% of value added by manufacture. Therefore, 6,007 of the
largest firms certainly represent the greater proportion of output
compared to those companies not included.) If the firms studied are
drawn from the Fortune 500, the 6,007 firms in the database can safely
be assumed to constitute the relevant industry referent groups for
those firms.

A limitation of privately generated databases, including S&P's
Compustat Industry Aggregate database and S&P Financial Dynamics'
Industry Composites, is that these data are based on annually reported
firm-level data. In contrast to Compustat II's line-of-business data,
most such industry data is not developed by separating firms into their
diversified segments and therefore risks inaccuracy by misattributing
a firm's entire data to its primary industry affiliation.

There appear to be systematic biases in both the Census of
Manufacturers and the COMPUSTAT II databases: the Census data with
respect to duplication of shipments and its 5-year data collection
frequency, and the COMPUSTAT data with respect to more limited company
coverage. Researchers choosing one database or the other may also be
interested to know that industry growth rates calculated from COMPUSTAT
(as measured by change in sales) and from the Census of Manufacturers
(as measured by change in the value of shipments) showed a high degree
of correlation (more than 0.70, significant at the .001 level).

USE OF COMPUSTAT DATA TO STUDY BUSINESS RELATEDNESS

The segment SICs in the COMPUSTAT line of business database can be
effectively and efficiently utilized to evaluate "relatedness" in
firms' diversification strategies. Rumelt (1974) and many researchers
following him have used methods that differentiate between related and
unrelated diversification in categoric terms. Berry (1974), Jacquemin
and Berry (1979), Montgomery (1982), and Palepu (1985) have all used
continuous measures or indices to evaluate total diversification,
without comparing related and unrelated diversifiers. Berry (1974) and
Montgomery (1982) used a variant of the Herfindahl index of industry
concentration to measure firms' total diversification. Jacquemin and
Berry (1979) developed an entropy-based measure of diversification,
later used by Palepu (1985). The entropy measure used by these
researchers expresses total diversification (DT) as the sum of two
indices, related diversification (DR) and unrelated diversification
(DU), such that DT = DR + DU.
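
For reference, the decomposition can be written out explicitly. The
notation below is introduced here for illustration and is an
assumption, not the notation of the cited authors: P_i is the share of
segment i in the firm's total sales, and P^j is the combined share of
all segments in 2-digit industry group j. The forms of DU and DR
follow the verbal definitions given in the notes to Table 4.

```latex
\begin{align*}
P^{j} &= \sum_{i \in j} P_{i}, \\
DU &= \sum_{j} P^{j} \ln\frac{1}{P^{j}}, \\
DR &= \sum_{j} P^{j} \sum_{i \in j} \frac{P_{i}}{P^{j}} \ln\frac{P^{j}}{P_{i}}, \\
DT &= \sum_{i} P_{i} \ln\frac{1}{P_{i}} = DR + DU .
\end{align*}
```

The last equality holds because the ln P^j terms in DR cancel against
those in DU, leaving only the segment-level entropy.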

The COMPUSTAT line of business database lends itself to measuring
relatedness by any of these methods. However, as discussed at some
length in a preceding section of this paper, segments should be kept
intact by researchers addressing many of these questions, even if the
SSICs differ greatly. (The firm has already defined a segment as
comprising related activities; it would therefore be erroneous for
researchers to split up segments.) For purposes of the Herfindahl and
entropy index measures, SSIC1 should be considered as the primary SIC
of the segment in accordance with the recommendation of S&P's COMPUSTAT
II documentation. Table 4 shows the indices for total diversification
(DT, DR and DU) produced by each of these methods using COMPUSTAT line
of business data for three firms, as well as categoric classifications
(Rumelt, 1974) for the same firms.


Insert Table 4 about here


(Segment SICs, segment names (descriptive) and other information are
often used as the basis for calculating the related, specialization and
vertical ratios for these categoric classifications.)

Researchers in strategic management are concerned with evaluating
both extent and type of firms' diversification. The diversification
measures outlined above address either extent or type, but not both.
For example, Table 4 shows that the entropy index values for Honeywell
and American Home Products, Inc., are very close, 1.36 and 1.31,
respectively. However, it is also clear that these values do not
satisfactorily express the difference in type of diversification in
these companies.

We argue that a variant of the entropy measure could address both
of these needs. We suggest an index of type of diversification which
takes into consideration the difference between DU and DR. The
following illustration will show the power of this simple variant of
the entropy measure in depicting type of diversification. Calculating
DD (difference in diversification types) as DD = DU - DR, a researcher
is then able to observe that a negative value of DD signifies a greater
level of related diversification, while a positive value of DD
signifies a greater level of unrelated diversification, and values
around zero suggest a balance between related and unrelated
diversification. The values for DD shown in Table 4 suggest that
American Home Products has a high level of related diversification
while Honeywell is evenly balanced between unrelated and related, and
ITT's high level of unrelated diversification is clearly evident. We
therefore argue that while DT does provide a measure of the extent of
strategic diversity in firms, DD provides a much-needed measure of the
type of diversification involved. The DD measure should be used
instead of DT when type of diversification is the research issue, and
together with DT (perhaps combined into an index) when both type and
extent of diversification are of research interest. Use of the DD
measure can help overcome the problem of large within-group variations,
which are characteristic of methods employing group classification
schema.
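
The calculation of DU, DR, DT and DD can be illustrated with the
segment sales shares listed in Table 4. The sketch below is a minimal
example under stated assumptions: the function name entropy_indices
and the nested-list layout (one inner list of segment shares per
2-digit industry group) are hypothetical, and each segment is assigned
to the group of its primary SIC. Shares are normalized by their own
sum, so small rounding differences in the inputs are tolerated.

```python
import math


def entropy_indices(groups):
    """Return DU, DR, DT and DD for one firm.

    `groups` holds one list per 2-digit industry group; each inner
    list contains that group's segment sales (any consistent unit).
    """
    total = sum(sum(g) for g in groups)
    du = 0.0
    dr = 0.0
    for g in groups:
        gp_sales = sum(g)
        gp_share = gp_sales / total
        # Unrelated component: entropy across industry groups.
        du += gp_share * math.log(total / gp_sales)
        # Related component: within-group entropy, weighted by group share.
        within = sum((s / gp_sales) * math.log(gp_sales / s) for s in g)
        dr += gp_share * within
    return {"DU": du, "DR": dr, "DT": du + dr, "DD": du - dr}


# Segment shares (percent of firm sales) as listed in Table 4.
firms = {
    "American Home Products": [[39.7, 13.3, 27.0], [20.0]],
    "Honeywell": [[24.1, 21.7], [19.7, 34.5]],
    "ITT": [[32.6, 4.3, 6.0, 6.0], [16.5], [11.2], [6.4], [6.8], [10.1]],
}
for name, groups in firms.items():
    indices = entropy_indices(groups)
    print(name, {k: round(v, 2) for k, v in indices.items()})
```

Run on these shares, the function gives DT values of approximately
1.31 for American Home Products and 1.36 for Honeywell, in line with
the figures quoted in the text, while the sign and size of DD separate
the three firms by type of diversification.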

Researchers studying dynamic aspects of diversification would
benefit from using our DD and DT measures together, as these continuous
measures are more sensitive to changes in strategic diversity than are
broad categoric classifications. Finally, the DD and DT measures can
be more readily replicated by researchers than can subjective categoric
classifications.

CONCLUSION 

We have considered the use of line-of-business data for three
issues of significant interest to strategic management practitioners
and researchers: assessment of strategic diversity, analysis of
industry trends, and evaluation of the presence of vertical
integration. Based on the analysis reported in this paper, we conclude
(1) that the Compustat II line-of-business database provides an
efficient and effective source of data for such questions, but (2) that
certain caveats apply to use of that data and must be observed to avoid
erroneous research results. A summary of those uses and caveats
follows.

First, for such questions as measurement of diversity, a segment
should not be considered to comprise two unrelated components, despite
the presence of two seemingly diverse segment SICs, because the firm
has already declared some relatedness through its segmentation. With
this caveat observed, the database can be quite effectively used to
calculate Herfindahl, entropy and other measures.

Second, we find Compustat II line-of-business data to be quite
satisfactory for the study of industry trends, assuming the above
caveat is observed. With an increasing proportion of industry output
originating from highly diversified firms, accurate data for
industry-level questions has been difficult to obtain. The drawbacks
of Census data were discussed above, as were those of currently
available industry aggregate data from private services. Compustat II
provides readily accessible data, disaggregated from diversified firms
to the business level, which can then be re-aggregated by industry.

Third,  Compustat  II  is  an  unexploited  resource  for  research  on 
vertical  integration.   As  explained  in  this  paper,  with  proper  use  of 
this  data  set,  researchers  can  detect  vertical  integration  not  only 
within  firms  but  also  within  segments  of  firms. 

Researchers observing the restrictions and recommendations outlined
in this paper for proper use of the Compustat II data set can proceed
with greater confidence to use Compustat II to address important
research questions for which appropriate data were previously
unavailable.


TABLE 4

Entropy Measures of Selected Firms

                         Seg Sales/  Gp Sales/
                         Tot Sales   Tot Sales                DT       DD     Categoric
Co. Name          S SIC     (%)         (%)      DU    DR   (DU+DR)  (DU-DR)    Class
--------------------------------------------------------------------------------------
Am. Home Prod.    2834      39.7
                  2834      13.3
                  2842      27.0       80.0
                  2032      20.0       20.0      .50   .81    1.31    -.31       R

Honeywell         3822      24.1
                  3823      21.7       45.8
                  3664      19.7
                  3680      34.5       54.2      .69   .67    1.36     .02       R

ITT               3661      32.6
                  3663       4.3
                  3679       6.0
                  3651       6.0       48.9
                  3823      16.5       16.5
                  2051      11.2       11.2
                  7011       6.4        6.4
                  2611       6.8        6.8
                  3714      10.1       10.1     1.48   .49    1.97     .99       U

R = related diversification
U = unrelated diversification

DU, an index value of unrelated diversification, is the weighted
average of all group shares across which the firm participates.
Each group gets a weight equal to its share in the total
operations of the firm, i.e., DU is the sum over all groups of
[(gp. sales/tot. sales) * ln(tot. sales/gp. sales)].

DR, an index value of related diversification, is a similar weighted
average of the related diversification across segments within all
industry groups in which the firm participates.